Issue 387 resolved #395

Open
wants to merge 3 commits into
base: main

Conversation

supercoder-dev

To keep the head dimension within the shared memory limit, add a check immediately after the line where `d_inner` is calculated. If `d_inner` exceeds a safe maximum value, clamp it to that maximum.
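
A minimal sketch of that clamp, for illustration only. The helper name `compute_d_inner`, the `expand * d_model` formula, and the `MAX_D_INNER` ceiling are assumptions; this description does not state the actual limit or where it is derived from.

```python
# Hypothetical ceiling; the real safe maximum depends on the target GPU
# and is not given in this pull request.
MAX_D_INNER = 4096


def compute_d_inner(d_model: int, expand: int = 2,
                    max_d_inner: int = MAX_D_INNER) -> int:
    """Return expand * d_model, capped at max_d_inner."""
    d_inner = int(expand * d_model)
    # Clamp instead of failing, as the description proposes.
    return min(d_inner, max_d_inner)
```

Note that clamping changes the layer width silently, so any weights sized from the unclamped value would no longer match.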
To keep the head dimension within hardware limits, add a check to the `__init__` methods of both the `MixerModel` and `MambaLMHeadModel` classes. The check ensures that the head dimension (`d_model`) does not exceed a set limit. If it does, the value is reduced to the maximum the hardware allows.
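
The constructor check could look roughly like the sketch below. `MixerModelSketch` is a stand-in class, not the real `MixerModel`, and `MAX_DIM` is a hypothetical limit; emitting a warning when the value is adjusted is also an assumption, added here so the change is not silent.

```python
import warnings

# Hypothetical hardware-derived limit; not taken from this pull request.
MAX_DIM = 8192


class MixerModelSketch:
    """Stand-in showing only the proposed dimension check."""

    def __init__(self, d_model: int, max_dim: int = MAX_DIM):
        if d_model > max_dim:
            # Tell the caller that the requested size was reduced.
            warnings.warn(
                f"d_model={d_model} exceeds limit {max_dim}; using {max_dim}"
            )
            d_model = max_dim
        self.d_model = d_model
```

Raising a `ValueError` instead of adjusting the value would be the stricter alternative, since a clamped model no longer has the shape the caller asked for.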
To solve the problem, add a parameter that configures the head dimension (`headdim`) and validate it so that it does not exceed hardware limits. Memory allocation and kernel function calls must also be adjusted to use the configured head dimension so that memory usage is optimized.
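
One way to express a configurable, validated `headdim` is sketched below. The `HeadConfig` class, its defaults, the `max_headdim` limit, and the divisibility check are all assumptions made for illustration; the kernel and allocation changes are not shown.

```python
from dataclasses import dataclass


@dataclass
class HeadConfig:
    """Hypothetical config holding the head dimension and its limit."""

    d_model: int
    expand: int = 2
    headdim: int = 64
    max_headdim: int = 256  # hypothetical hardware limit

    def __post_init__(self):
        if self.headdim <= 0:
            raise ValueError("headdim must be positive")
        if self.headdim > self.max_headdim:
            raise ValueError(
                f"headdim={self.headdim} exceeds limit {self.max_headdim}"
            )
        if self.d_inner % self.headdim != 0:
            raise ValueError("d_inner must be divisible by headdim")

    @property
    def d_inner(self) -> int:
        # Inner width, assumed here to be expand * d_model.
        return self.expand * self.d_model

    @property
    def nheads(self) -> int:
        # Number of heads implied by the configured head dimension.
        return self.d_inner // self.headdim
```

Validating at construction time means downstream buffer sizes can be computed from `nheads` and `headdim` without rechecking them.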
